Abstract. The trace subshift of a cellular automaton is the subshift of 
all possible columns that may appear in a space-time diagram. In this 
paper we study conditions for a sofic subshift to be the trace of a cellular 
automaton. 
1 Introduction 

Cellular automata are well-known formal models for complex systems. They are 
used in a huge variety of different scientific fields including mathematics, physics 
and computer science. 

A cellular automaton (CA) consists of an infinite number of identical cells 
arranged on a regular lattice indexed by $\mathbb{Z}$. Each cell is a finite automaton whose 
state takes values in a finite set $A$. All cells evolve synchronously according to 
their own state and those of their neighbors. 

The study and classification of the evolutions of cellular automata is one of 
the standing open problems in the field [1-6]. Indeed, the simple definition of CA 
contrasts with the wide variety of their evolutions. An interesting idea is to classify 
these behaviors according to some notion of complexity. Of course, the word 
"complexity" means different things to different researchers. For this reason, 
in the literature one finds classifications according to topological entropy, measure 
theory, dimension theory, attractors, algorithmic complexity, etc. 

In this paper we follow a formal languages approach. Each CA is associated 
with a language. The idea is that the more complex the language, the more 
complex the automaton. 
Topological Properties, Chaos and Associated Formal Languages", by the ANR 
Blanc "Projet Sycomore". 
** Corresponding author. 



The associated language is defined as follows (see Section 2 for more precise 
definitions). Each CA can be seen as a discrete dynamical system $(A^{\mathbb{Z}}, F)$, where 
$F$ is the global function. Let $\beta = \{\beta_1, \beta_2, \ldots, \beta_k\}$ be a finite partition of $A^{\mathbb{Z}}$. Then an 
orbit of initial condition $x$, $O_F(x) = (x, F(x), \ldots, F^n(x), \ldots)$, can be associated 
with the infinite word $w$ such that $\forall n \in \mathbb{N}$, $w_n = i$ if $F^n(x) \in \beta_i$. Then $w$ is 
the $\beta$-trace of $F$ with initial condition $x$. The $\beta$-trace set $\Sigma$ of $F$ is the set of 
$\beta$-traces with all possible initial conditions. Remark that when $A^{\mathbb{Z}}$ is endowed 
with the Cantor topology (see Section 2), $\Sigma$ is a closed shift-stable set, i.e. a 
subshift. The language $\mathcal{L}(\Sigma)$ of factors occurring in configurations of $\Sigma$ is the 
language associated with $(A^{\mathbb{Z}}, F)$. 

In [7], Kůrka classified factor subshifts of CA according to their language 
complexity. He devised three classes: bounded periodic; regular but not bounded 
periodic; not regular. In this paper we address a somewhat complementary question, 
namely, given a subshift $\Sigma$ of a certain language complexity, we wonder if 
it can be the trace of a CA. 

Motivations come both from classical symbolic dynamics and from 
physics. Indeed, when observing natural phenomena, physical constraints allow 
one to keep track of only a finite number of measurements. This set of measurements 
usually takes into account only a minor part of the parameters ruling 
the phenomenon under investigation. Hence, to some extent, what is observed is 
the "trace" of the phenomenon left on the "instruments" rather than the whole 
phenomenon in its entirety. 

It is a very important issue (Galilean principle) to find a formal model which 
can reproduce the observed trace. Restating this reasoning in our context: 
given a subshift $\Sigma$, one wonders which discrete dynamical system can produce 
it. In particular, one can ask if there exists a CA having $\Sigma$ as a trace. 

Giving a complete answer to this question seems very hard. In this paper we 
give some sufficient conditions for a regular language to be traceable. The proof 
is constructive. We believe that the construction of the CA is of some interest 
in its own right. 

Because of the lack of space some of the proofs are omitted. They can be 
found in the Appendices. 

2 Definitions 

Let $\mathbb{N}^* = \mathbb{N} \setminus \{0\}$. For $i, j \in \mathbb{N}$ with $i \le j$, $[i,j]$ denotes the set of integers 
between $i$ and $j$. For any function $F$ from $A^{\mathbb{Z}}$ into itself, $F^n$ denotes the $n$-fold 
composition of $F$ with itself. A set $S \subset A^{\mathbb{Z}}$ is $F$-stable if $F(S) \subset S$. 

Languages. Let $A$ be a finite alphabet with at least two letters. A word is a finite 
sequence of letters $w = w_0 \ldots w_{|w|-1} \in A^*$. Its reverse $\tilde{w}$ is $w_{|w|-1} \ldots w_0$ and its 
rotation $\gamma(w)$ is $w_1 \ldots w_{|w|-1} w_0$. A factor of a word $w = w_0 \ldots w_{|w|-1} \in A^*$ 
is a word $w_{[i,j]} = w_i \ldots w_j$, for $0 \le i \le j < |w|$. We note $w_{[i,j]} \sqsubset w$. 
The empty word is denoted by $\varepsilon$. Given two languages $C, D \subset A^*$, $CD$ 
denotes their concatenation, $C + D$ their union, $C^* = \bigcup_{n \in \mathbb{N}} C^n$, and 
$C^\omega = \{ z \in A^{\mathbb{N}} \mid \forall j \in \mathbb{N}, \exists k > j, z_{[0,k-1]} \in C^* \}$. When no confusion is possible, given 
a word $w$, we also denote by $w$ the language $\{w\}$. 

Configurations. A configuration is a biinfinite sequence of letters $x \in A^{\mathbb{Z}}$. The set 
$A^{\mathbb{Z}}$ of configurations is the phase space. The definition of factor can be naturally 
extended to configurations: for $x \in A^{\mathbb{Z}}$ and $i \le j$, $x_{[i,j]} = x_i \ldots x_j \sqsubset x$. If $u \in A^*$, 
then $u^\omega$ is the infinite word consisting of periodic repetitions of $u$, and ${}^\omega u^\omega$ is 
the configuration consisting of periodic repetitions of $u$. 

Topology. We endow the phase space with the Cantor topology. A base for open 
sets is given by cylinders. For $j, k \in \mathbb{N}$ and a finite set $W$ of words of length $j$, we 
will note $[W]_k$ the cylinder $\{ x \in A^{\mathbb{Z}} \mid x_{[k,k+j-1]} \in W \}$. $[W]_k^C$ is the complement 
of the cylinder $[W]_k$. 

Cellular automata. A (one-dimensional) cellular automaton is a parallel synchronous 
computation model consisting of cells distributed over a regular lattice 
$\mathbb{Z}$. Each cell has a state in the finite alphabet $A$, which evolves depending on 
the states of its neighbors according to a local rule $f : A^d \to A$, where $m \in \mathbb{Z}$ 
and $d \in \mathbb{N}^*$ are the anchor and the diameter of the CA, respectively. The global 
function of the CA is $F : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ such that $F(x)_i = f(x_{[i-m,i-m+d-1]})$ for 
every $x \in A^{\mathbb{Z}}$ and $i \in \mathbb{Z}$. The space-time diagram of initial configuration $x \in A^{\mathbb{Z}}$ 
is the sequence of the configurations of the orbit $(F^j(x))_{j \in \mathbb{N}}$. Usually it is 
graphically represented by a two-dimensional diagram like in Figure 1. 

The shift map $\sigma : A^{\mathbb{Z}} \to A^{\mathbb{Z}}$ is a particular CA global function defined by 
$\sigma(x)_i = x_{i+1}$ for every $x \in A^{\mathbb{Z}}$ and $i \in \mathbb{Z}$. According to the Hedlund theorem [8], 
the global functions of CA are exactly the continuous self-maps of $A^{\mathbb{Z}}$ commuting 
with the shift map. 

Any local rule $f$ of a CA can be extended naturally to an application on 
words $f(w) = (f(w_{[i,i+d-1]}))_{0 \le i \le |w|-d}$, for all $w \in A^*A^d$. 
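
The definitions above can be made concrete on finite data. The following is a minimal sketch, assuming spatially periodic configurations (so that a finite list stands for a configuration) and the window convention $x_{[i-m,i-m+d-1]}$; the names `extend_rule`, `global_step`, `shift_rule` and `xor_rule` are illustrative and do not come from the paper.

```python
def extend_rule(f, d, w):
    """Apply the local rule f to every window of length d of the finite word w."""
    return [f(tuple(w[i:i + d])) for i in range(len(w) - d + 1)]


def global_step(f, m, d, x):
    """One step of F on the spatially periodic configuration with period x.

    Cell i reads the window x[i-m .. i-m+d-1], indices taken modulo len(x).
    """
    n = len(x)
    return [f(tuple(x[(i - m + j) % n] for j in range(d))) for i in range(n)]


def shift_rule(window):
    # The shift map seen as a CA: anchor 0, diameter 2, f(x0, x1) = x1.
    return window[1]


def xor_rule(window):
    # Hypothetical example rule (anchor 1, diameter 3): XOR of the two neighbors.
    return window[0] ^ window[2]


x = [1, 0, 0, 1, 1, 0]
shifted = global_step(shift_rule, 0, 2, x)   # sigma(x)
image = global_step(xor_rule, 1, 3, x)       # F(x)
```

On this example one can check that applying the rule and then shifting gives the same result as shifting and then applying the rule, as Hedlund's theorem predicts for any CA.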

Dynamical systems. A dynamical system is a couple $(X, F)$ where $X$ is a set 
called the phase space, and $F : X \to X$ is a continuous self-map. $(Y, F)$ is a 
subsystem of $(X, F)$ if $Y$ is a closed $F$-stable subset of $X$. A set $Y$ is $F$-stable if 
$F(Y) \subset Y$. 

Morphisms. A morphism of the dynamical system $(X, F)$ into the dynamical 
system $(Y, G)$ is a continuous map $\phi : X \to Y$ such that $\phi \circ F = G \circ \phi$. A 
conjugacy (resp. a factorization) is a bijective (resp. surjective) morphism; in 
that case we say that $(Y, G)$ is conjugate to (resp. a factor of) $(X, F)$ (we expect 
the reader not to confuse the use of the word "factor" in the dynamical 
system context with its use in language theory). 



Subshifts. The onesided shift, also noted $\sigma$, is the self-map of $A^{\mathbb{N}}$ such that 
$\sigma(z)_i = z_{i+1}$, for every $z \in A^{\mathbb{N}}$ and $i \in \mathbb{N}$. A onesided subshift (or simply 
a subshift) $\Sigma \subset A^{\mathbb{N}}$ is a $\sigma$-stable closed set of infinite words. The language 
of $\Sigma$ is $\mathcal{L}(\Sigma) = \{ w \in A^* \mid \exists z \in \Sigma, w \sqsubset z \}$, and characterizes $\Sigma$, since 
$\Sigma = \{ z \in A^{\mathbb{N}} \mid \forall w \sqsubset z, w \in \mathcal{L}(\Sigma) \}$. The alphabet of the subshift is the set 
$\{ a \in A \mid \exists z \in \Sigma, az \in \Sigma \}$, i.e. the set of letters that appear in infinite words 
belonging to $\Sigma$. A subshift $\Sigma$ is transitive if for all words $u, v \in \mathcal{L}(\Sigma)$ there 
exists $w \in A^*$ such that $uwv \in \mathcal{L}(\Sigma)$. A subshift can also be characterized by a 
language $\mathcal{F} \subset A^*$ of forbidden words, i.e. such that $\Sigma = \{ z \in A^{\mathbb{N}} \mid \forall w \in \mathcal{F}, w \not\sqsubset z \}$. 
A subshift is of finite type (SFT for short) if it has a finite language of forbidden 
words. It is a $k$-SFT (for $k \in \mathbb{N}$) if it has a finite set of forbidden words of length 
$k$. A subshift $\Sigma$ is sofic if $\mathcal{L}(\Sigma)$ is a regular language. The following characterization 
of sofic subshifts will be very useful in the sequel. For more about subshifts, 
see for instance [9]. 

Theorem 1 (Weiss [10]). A subshift is sofic if and only if it is a factor of a 
SFT. 

Hedlund's theorem can be extended as follows. 

Theorem 2 (Hedlund [8]). A function $\phi$ is a morphism of a subshift $(X, \sigma)$ 
on alphabet $A$ into a subshift $(Y, \sigma)$ on alphabet $B$ if and only if there is a radius 
$r \in \mathbb{N}$ and a local rule $f : A^{r+1} \to B$ such that $\forall j \in \mathbb{N}, \forall x \in X$, $\phi(x)_j = 
f(x_j \ldots x_{j+r})$ (we say $\phi$ is an $r$-block map). 

3 Traces 

In this section we define the main notion introduced in the paper, namely, the 
trace of a CA and the traceability of a subshift. Moreover, we give a simple 
necessary condition for a subshift being traceable. 



Fig. 1. The trace $T_F(x)$ seen on the space-time diagram. 



Definition 1 (Trace). Given a CA $F$, the trace applications are defined for 
$k \in \mathbb{Z}$ by $T_F^k(x) = (F^j(x)_k)_{j \in \mathbb{N}}$. In other words, $T_F^k(x)$ is the $k$-th column of the 
space-time diagram of initial configuration $x$ (see Figure 1). We note $T_F = T_F^0$. 
We say that $T_F(x)$ is the trace of $F$ with initial condition $x$. 



The study of trace applications can be reduced to the study of $T_F$, because of 
the shift-invariance of CA. 
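
The reduction to column 0 can be checked numerically. The sketch below is only an illustration under stated assumptions: configurations are spatially periodic (finite lists), and `rule` is a hypothetical local rule of anchor 1 and diameter 3 chosen for the example; `step` and `trace` are illustrative names.

```python
def step(f, m, d, x):
    """One step of the global function on a spatially periodic configuration."""
    n = len(x)
    return [f(tuple(x[(i - m + j) % n] for j in range(d))) for i in range(n)]


def trace(f, m, d, x, k, steps):
    """First `steps` letters of the k-th column of the space-time diagram of x."""
    column = []
    for _ in range(steps):
        column.append(x[k % len(x)])
        x = step(f, m, d, x)
    return column


def rule(window):
    # Hypothetical rule (anchor 1, diameter 3): XOR of the two neighbors.
    return window[0] ^ window[2]


x = [1, 0, 0, 1, 1, 0]
k = 2
shifted = x[k:] + x[:k]                      # sigma^k(x)
column_k = trace(rule, 1, 3, x, k, 5)        # k-th column of x
column_0 = trace(rule, 1, 3, shifted, 0, 5)  # column 0 of sigma^k(x)
```

Here `column_k` and `column_0` coincide, which is the finite counterpart of $T_F^k = T_F \circ \sigma^k$.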

It can be noticed that this notion of trace corresponds to the $\beta$-trace as 
defined in the introduction, with $\beta = \{ [a] \mid a \in A \}$ being the partition of $A^{\mathbb{Z}}$ into 
cylinders of width 1. 

Definition 2 (Traceability). The trace subshift of a CA $F$ is $\tau(F) = T_F(A^{\mathbb{Z}})$. 
It is a factor subshift of $(A^{\mathbb{Z}}, F)$, since $T_F$ is continuous and $T_F \circ F = \sigma \circ T_F$. 
A subshift $\Sigma$ is traceable if there exists a CA $F$ for which $\Sigma = \tau(F)$. 

We begin with a condition for the traceability of a subshift. Proposition 2 
proves that it is necessary. 

Definition 3 (T0 subshift). A subshift is T0 if it includes a 2-SFT with the 
same alphabet. 

Proposition 1. A subshift $\Sigma$ is T0 if and only if there exists a map $\phi : A \to A$ 
such that for every letter $a \in A$, $(\phi^j(a))_{j \in \mathbb{N}} \in \Sigma$. 

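Proof. If $\forall a \in A$, $(\phi^j(a))_{j \in \mathbb{N}} \in \Sigma$, then $\{ (\phi^j(a))_{j \in \mathbb{N}} \mid a \in A \}$ is a 2-SFT of 
alphabet $A$ and set of forbidden words $\bigcup_{a \in A} a(A \setminus \{\phi(a)\})$. Conversely, consider 
a 2-SFT $\Gamma \subset \Sigma$ of alphabet $A$. Define $\phi(a) = b$ such that $ab \in \mathcal{L}(\Gamma)$. Since $\Gamma$ is a 
2-SFT, we have that $\forall a \in A$, $(\phi^j(a))_{j \in \mathbb{N}} \in \Gamma$. □ 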
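
The first half of this proof can be illustrated on a finite prefix. The following is a minimal sketch, assuming the map $\phi$ is given as a Python dictionary; `orbit` and `forbidden_pairs` are illustrative names, and the two maps used are those of Example 1.

```python
def orbit(phi, a, n):
    """First n letters of the infinite word (phi^j(a))_j."""
    out = []
    for _ in range(n):
        out.append(a)
        a = phi[a]
    return out


def forbidden_pairs(phi):
    """Forbidden words of length 2: every ab with b different from phi(a)."""
    return {(a, b) for a in phi for b in phi if b != phi[a]}


phi_swap = {0: 1, 1: 0}   # phi(0) = 1, phi(1) = 0
phi_one = {0: 1, 1: 1}    # phi(0) = phi(1) = 1

word = orbit(phi_swap, 0, 6)
avoids = all((a, b) not in forbidden_pairs(phi_swap)
             for a, b in zip(word, word[1:]))
```

By construction every $\phi$-orbit avoids the forbidden pairs, so the orbits form a 2-SFT on the same alphabet.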

Example 1. Consider the following subshifts: 

- $\Sigma = (1 + \varepsilon)(01)^\omega$; it is finite T0 (with $\phi(0) = 1$ and $\phi(1) = 0$); 

- $\Sigma = 0^\omega + 0^*1^\omega$; it is infinite T0 (with $\phi(0) = \phi(1) = 1$); 

- $\Sigma = (01 + 1 + \varepsilon)(001)^\omega$; it is finite but not T0. 

Proposition 2. The trace subshift of a CA is T0. 

Proof. Consider a CA $F$. For any $a \in A$, define $\phi(a)$ as $F({}^\omega a^\omega)_0$. Then 
$(\phi^j(a))_{j \in \mathbb{N}} = T_F({}^\omega a^\omega) \in \tau(F)$. By Proposition 1, $\tau(F)$ is T0. □ 

Theorem 3. Any 2-SFT is traceable. 

Proof. Consider a 2-SFT $\Sigma$ and let $A$ be its alphabet. For each $a \in A$, define 
$\phi(a) = b$ such that $ab \in \mathcal{L}(\Sigma)$. Consider the CA of anchor 0, diameter 2 and 
local rule $f$ defined by $f(x_0, x_1) = x_1$ if $x_0x_1 \in \mathcal{L}(\Sigma)$, and $\phi(x_0)$ otherwise. If 
$x \in A^{\mathbb{Z}}$, then the definition of the rule gives that every factor of length 2 of its 
trace is in $\mathcal{L}(\Sigma)$. Conversely, if $z \in \Sigma$, then we can see by induction that $z$ is 
the trace of $F$ with initial condition $x$ as soon as $x_{[0,+\infty)} = z$. □ 
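
The construction in this proof is easy to experiment with. The sketch below is an illustration only: it assumes the 2-SFT is given by its set of allowed pairs, works on finite windows (after $t$ steps only the first $n - t$ cells of a window of length $n$ are known), and uses the golden mean shift (no two consecutive 1s) as a hypothetical test case; `make_rule` and `trace` are illustrative names.

```python
from itertools import product


def make_rule(allowed, alphabet):
    """Local rule of anchor 0 and diameter 2 built from a 2-SFT."""
    # phi(a): some letter b such that ab is allowed (assumed to exist).
    phi = {a: next(b for b in alphabet if (a, b) in allowed) for a in alphabet}

    def f(x0, x1):
        return x1 if (x0, x1) in allowed else phi[x0]

    return f


def trace(f, x, steps):
    """Column 0 of the space-time diagram of the window x (needs len(x) > steps)."""
    row = list(x)
    column = [row[0]]
    for _ in range(steps):
        row = [f(row[i], row[i + 1]) for i in range(len(row) - 1)]
        column.append(row[0])
    return column


alphabet = (0, 1)
allowed = {(0, 0), (0, 1), (1, 0)}   # golden mean shift: the word 11 is forbidden
f = make_rule(allowed, alphabet)

# Every trace only contains allowed factors of length 2.
all_traces_valid = all(
    (a, b) in allowed
    for x in product(alphabet, repeat=7)
    for a, b in zip(trace(f, x, 6), trace(f, x, 6)[1:])
)

# A word of the subshift is reproduced as the trace of itself.
z = [0, 1, 0, 0, 1, 0, 0]
```

On valid inputs the rule acts as a shift, which is why the trace of `z` is `z` itself; on invalid inputs the fallback $\phi$ keeps the column inside the subshift.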



4 k-traceability 

In order to establish finer results we first need a weaker condition for traceability, 
namely $k$-traceability. A subshift of alphabet $A$ is $k$-traceable if it is the set of 
columns of a CA on the alphabet $A^k$. The difference between a CA tracing a 
subshift of alphabet $A$ and one tracing a subshift of alphabet $A^k$ is that the rule 
of the latter can use the knowledge of the position of a letter of $A$ in a word over 
$A^k$. This results in a much simpler construction. 



Notation. If $\Sigma$ is a subshift on an alphabet $B \subset A^k$, and $q \in [0, k-1]$, then the 
$q$-th projection is defined as 

$$\pi_q : B^{\mathbb{N}} \to A^{\mathbb{N}}, \qquad (z_j)_{j \in \mathbb{N}} \mapsto ((z_j)_q)_{j \in \mathbb{N}}.$$

We also note $\pi(\Sigma) = \bigcup_{0 \le q < k} \pi_q(\Sigma)$, which is a subshift on $A$. 

In this section we will limit our study to onesided CA, i.e. with anchor 0. 

Definition 4 (k-traceability). Given a CA $F$ on the alphabet $B \subset A^k$, the $k$-trace 
subshift is defined by 

$$\mathring{\tau}(F) = \bigcup_{0 \le q < k} \{ ((F^j(x)_0)_q)_{j \in \mathbb{N}} \mid x \in B^{\mathbb{Z}} \} = \pi(\tau(F)).$$

A subshift is $k$-traceable if it is the $k$-trace of a onesided CA on the alphabet 
$B \subset A^k$. 

Similarly to what was done in the previous section, we give a necessary condition 
for being $k$-traceable. 

Definition 5 (T1 subshift). A subshift $\Sigma$ on the alphabet $A$ is T1 if there 
exists $k \in \mathbb{N}^*$ and a 2-SFT $\Gamma \subset (A^k)^{\mathbb{N}}$, such that $\pi_0(\Gamma) = \pi(\Gamma) = \Sigma$ (in 
particular, $\Sigma$ is a factor of $\Gamma$). 

Theorem 4. A T1 subshift is $k$-traceable for some $k \in \mathbb{N}^*$. 

Proof. By Theorem 3, the corresponding $\Gamma \subset (A^k)^{\mathbb{N}}$ is the trace of a CA $F$ on 
some alphabet $B \subset A^k$. Hence, $\mathring{\tau}(F) = \pi(\tau(F)) = \pi(\Gamma) = \Sigma$. □ 

Example 2. Consider the subshift $\Sigma = (01 + 1 + \varepsilon)(001)^\omega$. It is T1 (define the 
2-SFT $\Gamma$ as the $\sigma$-orbit of $(uvw)^\omega$ on the alphabet $B = \{u, v, w\} \subset A^3$, where $u = 001$, $v = 010$ 
and $w = 100$). It is thus 3-traceable, but not traceable since it is not T0. 
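
The projections in this example can be verified on finite prefixes. This is a small sketch under the assumption that $\Gamma$ consists of the three shifts of $(uvw)^\omega$; `projection`, `gamma` and `sigma` are illustrative names, and only prefixes of length `n` are compared.

```python
def projection(blocks, q):
    """q-th projection of a sequence of blocks: keep letter q of each block."""
    return "".join(block[q] for block in blocks)


u, v, w = "001", "010", "100"
cycle = [u, v, w]
n = 12

# Prefixes of the three points of the orbit of (uvw)^omega under the shift.
gamma = [[cycle[(s + j) % 3] for j in range(n)] for s in range(3)]

# Prefixes of the three points of (01 + 1 + epsilon)(001)^omega.
sigma = {(prefix + "001" * n)[:n] for prefix in ("", "1", "01")}

pi_0 = {projection(point, 0) for point in gamma}
pi_all = {projection(point, q) for point in gamma for q in range(3)}
```

Both `pi_0` and `pi_all` equal `sigma` on these prefixes, matching the requirement $\pi_0(\Gamma) = \pi(\Gamma) = \Sigma$ of Definition 5.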

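Theorem 5. Any SFT is T1. 

Proof. Let $\Sigma$ be a $k$-SFT for some $k \in \mathbb{N}^*$. $\Gamma = \{ (z_{[j,j+k-1]})_{j \in \mathbb{N}} \mid z \in \Sigma \}$ is a 
2-SFT, and $\pi(\Gamma) = \bigcup_{0 \le q < k} \{ (z_{j+q})_{j \in \mathbb{N}} \mid z \in \Sigma \} = \bigcup_{0 \le q < k} \sigma^q(\Sigma) = \Sigma = \pi_0(\Gamma)$. 

□ 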
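
The recoding used in this proof groups each window of $k$ consecutive letters into one letter of $A^k$. A minimal sketch on a finite word, with illustrative names `block_recoding` and `projection` and an arbitrary sample word `z`:

```python
def block_recoding(z, k):
    """Sequence of the windows z[j .. j+k-1], each seen as one letter of A^k."""
    return [tuple(z[j:j + k]) for j in range(len(z) - k + 1)]


def projection(blocks, q):
    """q-th projection: keep letter q of each block."""
    return [block[q] for block in blocks]


z = [0, 1, 0, 0, 1, 0, 1, 0]   # arbitrary sample word
k = 3
blocks = block_recoding(z, k)

# Consecutive blocks overlap on k-1 letters: a constraint on pairs of blocks.
overlaps = all(blocks[j][1:] == blocks[j + 1][:-1]
               for j in range(len(blocks) - 1))
```

The $q$-th projection of the recoded word is the original word shifted by $q$, which is the identity $\pi_q(\Gamma) = \sigma^q(\Sigma)$ used above.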

This result allows us to prove the next proposition, which is a less restrictive 
condition for being T1. 

Proposition 3. A subshift $\Sigma$ is T1 if and only if it is a factor of a SFT $\Gamma$ on 
alphabet $A^k$ for some $k \in \mathbb{N}^*$ such that $\pi(\Gamma) \subset \Sigma$. 

Now we extend the results on $k$-traceability to sofic subshifts (with some 
additional properties). 

Definition 6 (T2 subshift). A subshift is T2 if it is sofic and includes an 
infinite transitive subshift. 

Theorem 6. Any T2 subshift is T1. 



The proof of Theorem 6 is given using the following lemmata. 



Lemma 1. A sofic transitive subshift on alphabet $A$ is infinite if and only if for 
all $n \ge 2$, it includes a subshift $B^\omega$ with $B \subset A^k$, $|B| \ge n$ and some $k \in \mathbb{N}^*$. 

Lemma 2. If $\Sigma$ is a factor subshift of a SFT $\Gamma$ on alphabet $B$ such that $B^\omega \subset 
\Sigma$, then $\Sigma$ is T1. 

Proof (of Theorem 6). Let $\Sigma$ be a T2 subshift. From Theorem 1, it is a factor of 
a SFT $\Gamma$. Thanks to Lemma 1, there is an arbitrarily large set of words $B$ on 
$A$ such that $B^\omega \subset \Sigma$, so we can assume without loss of generality that $\Gamma$ is 
a subshift on such an alphabet $B \subset A^k$ for some $k \in \mathbb{N}^*$. From Lemma 2, we 
conclude that $\Sigma$ is T1. □ 



5 From k-trace to trace 

In the previous section, we gave a sufficient condition for a particular subshift 
$\Sigma$ to be $k$-traced by a CA $G$ on an alphabet $B \subset A^k$. In this section, we show 
how to simulate $G$ with another CA, on alphabet $A$, in such a way that its trace 
is $\Sigma$. This can be done if we add a further condition to our subshift. 

Definition 7 (T3 subshift). A subshift $\Sigma$ is T3 if there is a map $\phi : A \to A$ 
such that for every letter $a \in A$, $(\phi^j(a))_{j \in \mathbb{N}} \in \Sigma$ (it is T0) and there is a word 
$w \in A^* \setminus \phi(A)^*$ such that $w^\omega \in \Sigma$. 

Example 3. Consider the following subshifts. 

- $\Sigma = (1 + \varepsilon)(01)^\omega$ is not T3. 

- $\Sigma' = 0^\omega + 0^*1^\omega$ is T3 (with $\phi(0) = \phi(1) = 1$ and $w = 0$). 
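
Both claims of this example can be explored by brute force on prefixes. The sketch below is only a finite check, not a proof: it assumes the two subshifts are described by regular expressions for their prefixes of length `n`, enumerates every map $\phi$ on the alphabet, and searches for words $w$ up to a small length; `t3_witnesses`, `in_sigma` and `in_sigma_prime` are illustrative names.

```python
import re
from itertools import product


def t3_witnesses(alphabet, in_language, n=8, max_len=3):
    """Pairs (phi, w) that pass the T3 conditions on prefixes of length n."""
    found = []
    for images in product(alphabet, repeat=len(alphabet)):
        phi = dict(zip(alphabet, images))

        def orbit(a):
            out = ""
            for _ in range(n):
                out += a
                a = phi[a]
            return out

        # T0 part: every phi-orbit must stay in the subshift.
        if not all(in_language(orbit(a)) for a in alphabet):
            continue
        image = set(images)
        for length in range(1, max_len + 1):
            for w in map("".join, product(alphabet, repeat=length)):
                # w must use a letter outside phi(A), and w^omega must be allowed.
                if not set(w) <= image and in_language((w * n)[:n]):
                    found.append((phi, w))
    return found


def in_sigma(s):
    # Prefixes of (1 + epsilon)(01)^omega: alternating words.
    return re.fullmatch(r"1?(01)*0?", s) is not None


def in_sigma_prime(s):
    # Prefixes of 0^omega + 0*1^omega: zeros followed by ones.
    return re.fullmatch(r"0*1*", s) is not None
```

For the first subshift the only admissible map is the swap, whose image is the whole alphabet, so no witness exists; for the second one the pair $\phi(0) = \phi(1) = 1$, $w = 0$ is found.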

Theorem 7. Any T3 $k$-traceable subshift (for some $k \in \mathbb{N}^*$) is traceable. 

This section presents a sketch of the proof of Theorem 7. Remark that it is 
well known that a CA on any alphabet can be simulated by a CA on any other 
alphabet (with at least two letters), provided that its diameter is wide enough. 
In particular, any CA on $B \subset A^k$ can be simulated by a CA on $A$. Each cell 
can see its neighborhood as words of $A^k$ and evolve accordingly. The problem 
is that all cells must have the same local rule, so they have to find from the 
neighborhood which column of the $A^k$ simulation they are representing. This is 
usually done using a special border word to delimit the words of $A^k$. 

In this section, $\Sigma$ denotes a T3 subshift on alphabet $A$ which is $k$-traceable by 
a onesided CA $G$. Let $\phi$ and $w$ be as in Definition 7. Assume $G$ has diameter 2 
(the construction can easily be generalized) and local rule $g : B^2 \to B$. 

In order to achieve the simulation, we first define border words to delimit $A^k$ 
cells. We have two execution modes: a simulation mode will simulate properly 
the execution of the CA on alphabet $B$, and a default mode will be applied if 
the neighborhood contains invalid information. This adds some issues: default 
evolution must be in $\Sigma$; border evolution must also evolve according to $\Sigma$; and 
we have to ensure that when a mode is applied to a cell, the same mode keeps 
being applied there in the following generations, since a change of mode would 
produce an invalid trace. These problems will be solved in the three following 
subsections. 



5.1 Borders 

In order to make our simulations, we need to delimit computation zones. This 
is obtained by using some special words called borders and defined as follows: 

$$\Upsilon = \{ a^{|w|} v \tilde{v} a^{k+3|w|} \mid a \in \phi(A), v \in O_\gamma(w) \} \subset A^l,$$

where $O_\gamma(w) = \{ \gamma^q(w) \mid 0 \le q < |w| \}$, and $l = k + 6|w|$. 

Borders have the property that they cannot overlap too widely. 
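
The overlap property can be tested by brute force on a small instance. The sketch below assumes that a border has the shape $a^{|w|} v \tilde{v} a^{k+3|w|}$ with $a \in \phi(A)$ and $v$ a rotation of $w$, as read from the definition above; the instance ($\phi(A) = \{1\}$, $w = 0$, $k = 2$) is a hypothetical choice taken from the second subshift of Example 3, and `rotations`, `borders` and `is_freezing` are illustrative names.

```python
def rotations(w):
    """All rotations of the word w."""
    return {w[q:] + w[:q] for q in range(len(w))}


def borders(phi_image, w, k):
    """Border words a^|w| v reverse(v) a^(k+3|w|), for a in phi(A), v a rotation of w."""
    return {a * len(w) + v + v[::-1] + a * (k + 3 * len(w))
            for a in phi_image for v in rotations(w)}


def is_freezing(words, k):
    """True if no two words of the set can overlap with a relative shift in [1, k]."""
    for i in range(1, k + 1):
        for b in words:
            if i >= len(b):
                return False          # no overlap left: the cylinders always intersect
            for c in words:
                if b[i:] == c[:len(c) - i]:
                    return False      # b at position 0 and c at position i are compatible
    return True


k, w = 2, "0"
border_set = borders("1", w, k)
```

On this instance there is a single border of length $k + 6|w| = 8$, and no two copies of it can overlap with a shift between 1 and $k + 3|w| = 5$.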

Border evolution. The border words of $\Upsilon$ must have an evolution in $\Sigma$. The 
following rule (of diameter 1) respects that condition: 

$$\Delta_\Upsilon : \Upsilon \to \Upsilon, \qquad a^{|w|} v \tilde{v} a^{k+3|w|} \mapsto \phi(a)^{|w|} \gamma(v) \widetilde{\gamma(v)} \phi(a)^{k+3|w|}.$$

Macrocells. We will decompose our configurations into macrocells. A macrocell 
is the concatenation of a valid word of $B$ and a border word. We can simulate 
the local rule $g$ by a macroevolution rule (local rule on macrocells of $B\Upsilon \subset A^h$, 
where $h = k + l = 2k + 6|w|$, and of diameter 2): 

$$\Delta : (B\Upsilon)^2 \to B\Upsilon, \qquad (u, v) \mapsto g(u_{[0,k-1]}, v_{[0,k-1]}) \, \Delta_\Upsilon(u_{[k,h-1]}).$$



5.2 Default mode 

Our CA will work as follows. Valid zones (with macrocells) evolve according 
to the macroevolution rule $\Delta$ so that they remain valid zones. Invalid zones 
run a microdefault mode so that they remain invalid zones. Nevertheless, frontiers 
between the zones must not move. At the frontiers, a macrodefault mode is 
applied in order for a macrocell to have the opportunity to evolve without taking 
into account its neighbors; that way, each cell will keep the same execution 
mode. 

Macrodefault mode. To do so, we extend the macroevolution to a function on 
$\Theta A^h$, where $\Theta = B\Upsilon A^h \setminus \bigcup_{0 < i < h} A^i B\Upsilon A^{h-i}$, because it does not take into account 
overlapping macrocells. This is crucial in order to define a local rule. If the central 
macrocell has a neighbor macrocell in $\Theta$, we apply a simulation step of the CA. 
Otherwise, we evolve as a macrodefault mode (simulation from a monochromatic 
configuration): 

$$\Delta : \Theta A^h \to B\Upsilon, \qquad u \mapsto \begin{cases} g(u_{[0,k-1]}, u_{[h,h+k-1]}) \, \Delta_\Upsilon(u_{[k,h-1]}) & \text{if } u \in B\Upsilon\Theta \\ g(u_{[0,k-1]}, u_{[0,k-1]}) \, \Delta_\Upsilon(u_{[k,h-1]}) & \text{otherwise} \end{cases}$$



Microdefault mode. The function $\phi$ (corresponding to the fact that $\Sigma$ is T0) allows us 
to define a microdefault mode for a neighborhood that does not contain any 
macrocell. We are now able to transform the function $\Delta$ into a local rule on $A$. 
Indeed, we can define, for anchor $m = h - 1$ and diameter $d = 3h - 1$: 

$$f : A^d \to A, \qquad w \mapsto \begin{cases} \Delta(u)_i & \text{if } w \in A^{m-i} u A^{i}, \text{ where } u \in \Theta A^h, i \in [0, h-1] \\ \phi(w_0) & \text{otherwise} \end{cases}$$

since such an integer $i$ and such a word $u$ would be unique (from the construction 
of $\Theta$). This local rule is such that $f(A^m u A^m) = \Delta(u)$ for every $u \in \Theta A^h$, which 
is what we wanted: it can simulate in one step the behavior of our CA on $B$. Let 
$F$ be the corresponding global rule. 

The following lemma guarantees that no column changes its evolution mode. 

Lemma 3. The preimage of cylinder $[B\Upsilon]$ is cylinder $[\Theta]$. Moreover, cylinder 
$[\Theta]$ and its complement $[\Theta]^C$ are $F$-stable (in particular, we cannot create a 
border). 

If a configuration is a valid encoding of some $y \in B^{\mathbb{Z}}$ (simulation mode), 
then its trace is some projection of the trace of $y$. Otherwise, macrodefault and 
microdefault modes also produce a trace which is in $\Sigma$. This concludes the proof 
of Theorem 7. 



6 Examples 

Example 4 (Finite untraceable T1 subshift). No CA traces the subshift $\Sigma = 
\{0^\omega, (01)^\omega, (10)^\omega\}$, even though it is T1. 

Example 5 (T1, T3, non-SFT, non-T2 subshift). The subshift $\Sigma = (0^*1 + 1^*)0^\omega$ 
is neither a SFT nor T2, but it is T1 and T3. Hence, by Theorems 4 and 7 it is 
traceable. 

Example 6 (Traceable non-T3 subshift). Let $f$ be the local rule of anchor 3 and 
diameter 7 such that $f(u_{-3}000111) = 1$, $f(000111u_3) = 0$, $f(u_{-3}001011) = 0$, 
$f(001011u_3) = 1$, and $f(u) = u_0$ otherwise. The trace subshift of the corresponding 
CA is $\tau(F) = \{0^\omega, (10)^\omega, (01)^\omega, 1^\omega\}$. In this case, $\tau(F)$ is finite but not 
T3. 

Example 7 (Traceable non-sofic subshift). Let $F$ be the CA on alphabet $A = 
\{b, r, l, w\}$ (the white, the right, the left and the wall particles, respectively) 
defined by the following local rule $f$ of anchor 1 and diameter 3: 

x_{-1} x_0 x_1      | rl? | ?rl | r?l | ?w? | ?rw | wl? | r?? | ??l | ??? 
f(x_{-1} x_0 x_1)   |  w  |  w  |  w  |  w  |  l  |  r  |  r  |  l  |  b 

where ? stands for any letter in $A$ and the first applicable rule is used (left to 
right). Then, $\tau(F)$ is not sofic. 
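
The behavior behind this example can be simulated directly. The sketch below is an illustration under explicit assumptions: the rule table is read as the nine patterns `rl?`, `?rl`, `r?l`, `?w?`, `?rw`, `wl?`, `r??`, `??l`, `???` (the positions of the wildcards are a reconstruction), and the initial configuration is a left particle between two walls, with `p` white cells on its left and `q` on its right; `RULES`, `local_rule` and `trace_between_walls` are illustrative names.

```python
# (pattern, output) pairs, tried from left to right; '?' matches any letter.
RULES = [("rl?", "w"), ("?rl", "w"), ("r?l", "w"), ("?w?", "w"),
         ("?rw", "l"), ("wl?", "r"), ("r??", "r"), ("??l", "l"), ("???", "b")]


def local_rule(left, center, right):
    """Output of the first pattern matching the window (left, center, right)."""
    window = left + center + right
    for pattern, output in RULES:
        if all(p == "?" or p == c for p, c in zip(pattern, window)):
            return output
    raise ValueError("unreachable: the last pattern matches everything")


def trace_between_walls(p, q, steps):
    """Trace at the initial position of the particle in w b^p l b^q w."""
    row = list("w" + "b" * p + "l" + "b" * q + "w")
    column = [row[p + 1]]
    for _ in range(steps):
        # Walls are invariant under the rule, so both end cells are kept as they are.
        inner = [local_rule(row[i - 1], row[i], row[i + 1])
                 for i in range(1, len(row) - 1)]
        row = [row[0]] + inner + [row[-1]]
        column.append(row[p + 1])
    return "".join(column)


sample = trace_between_walls(1, 2, 11)
```

The particle bounces between the two walls, so the observed column has the form $l b^{2p} r b^{2q} l b^{2p} r \ldots$: the two distances `p` and `q` must be matched across occurrences, which is the source of non-regularity.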



7 Putting things together 



In this paper, we have given sufficient conditions for a subshift to be the trace 
of a CA. 

T1 $\Rightarrow$ $k$-traceable (Theorem 4) 
SFT $\Rightarrow$ T1 (Theorem 5) 
T2 $\Rightarrow$ T1 (Theorem 6) 
T3 and $k$-traceable $\Rightarrow$ traceable (Theorem 7) 

The following summarizes all these results: 

Theorem 8. Any T3+T1, T3 SFT, or T3+T2 subshift is traceable. 

The present result follows other works on the structure that the trace of a 
CA can have. Here, we take the problem the other way around: we construct a 
CA that traces a particular kind of subshift. Though we do not have a necessary 
and sufficient condition, Conditions T0, T1, T2 and T3 are a first step toward 
a better understanding of what makes a sofic subshift traceable or not. 

Moreover, we expect our construction to be generalizable to weaker conditions. 
Nevertheless, the non-sofic case is still obscure. The general feeling is that 
it needs a completely different approach. 

In [7], it is proved that every factor subshift of a CA $F$ is a factor of some $\beta$-trace, 
where $\beta$ is a partition of $A^{\mathbb{Z}}$, and every $\beta$-trace is a factor of some column 
factor (i.e. a subshift $\{ (F^j(x)_{[0,q-1]})_{j \in \mathbb{N}} \mid x \in A^{\mathbb{Z}} \}$, for some $q \in \mathbb{N}$). Hence we 
can wonder now whether this kind of result can be generalized, in particular to 
the canonical factor $\{ (F^j(x)_{[0,d-1]})_{j \in \mathbb{N}} \mid x \in A^{\mathbb{Z}} \}$ of the CA. 
A Proofs of Section 4 



Proof (of Proposition 3). Assume that $\Sigma$ is a T1 subshift; then there is a 2-SFT 
$\Gamma$ such that $\pi_0(\Gamma) = \Sigma$. As $\pi_0$ is a factorization, $\Sigma$ is a factor of the SFT $\Gamma$. 

For the converse implication, assume that $\Sigma$ is a subshift on alphabet 
$A$, $\Gamma$ a SFT on alphabet $B \subset A^k$, and $\phi : \Gamma \to \Sigma$ a factorization, 
where $k \in \mathbb{N}$ and $\pi(\Gamma) \subset \Sigma$. We can suppose without loss of generality 
that $\pi(\Gamma) = \pi_0(\Gamma) = \Sigma$, should we replace it by its conjugate 
$\{ (a_jw_j)_{j \in \mathbb{N}} \in (AB)^{\mathbb{N}} \mid (w_j)_{j \in \mathbb{N}} \in \Gamma \text{ and } (a_j)_{j \in \mathbb{N}} = \phi((w_j)_{j \in \mathbb{N}}) \}$, which is still a 
SFT. From Theorem 5, $\Gamma$ is T1, i.e. there exists a 2-SFT $\Gamma'$ on alphabet $C \subset B^l$ 
for some $l \in \mathbb{N}$ such that $\pi_0(\Gamma') = \pi(\Gamma') = \Gamma$. We have just used the projections 
with respect to alphabet $B$ (recall that $C \subset B^l \subset (A^k)^l$), whose projections 
with respect to $A$ itself are $\pi_0(\Gamma) = \pi(\Gamma) = \Sigma$. Hence, with respect to $A$, we get 
$\pi_0(\Gamma') = \pi(\Gamma') = \Sigma$, and $\Sigma$ is T1. □ 

Proof (of Lemma 1). If $|B| \ge 2$, then $B^\omega$ is infinite (since all words of $B$ have 
the same length). Conversely, if $\Sigma$ is an infinite transitive sofic subshift, and 
$(A, Q, \delta)$ a deterministic finite automaton which recognizes the regular language 
$\mathcal{L}(\Sigma)$, then there is a state $q \in Q$ and two letters $a, b \in A$ such that $\delta(a, q)$ and 
$\delta(b, q)$ are defined (otherwise there would only be one possibility to extend the 
word from each state, and there would be at most as many infinite words in $\Sigma$ 
as initial states in $Q$). By transitivity, there is a word $u$ and a word $v$ such that 
$\delta(u, \delta(a, q)) = \delta(v, \delta(b, q)) = q$. If $n = |au| \cdot |bv|$, $u' = (au)^{|bv|}$, $v' = (bv)^{|au|}$, and $l$ 
is such that $2^l \ge n$, then $B = \{u', v'\}^l \subset (A^n)^l$ is appropriate. □ 



A.1 Proof of Lemma 2 

The function H defined in the following lemma allows us to define a clock of 
period 2n: the word H(i) represents the i-th clock tick. 

Lemma 4. Consider an alphabet $B \subset A^n$, with $|B| \ge 2$ and $n \in \mathbb{N}^*$. 
Then there is an injection $H : [0, 2n-1] \to A^{3n}$ such that the subshift 
$\Gamma = \{ (H((q+j) \bmod 2n))_{j \in \mathbb{N}} \mid q \in \mathbb{N} \}$ verifies $\pi(\Gamma) \subset B^\omega$. 

Proof. Let $u, v \in B$, suppose $u_0 \ne v_0$ (other cases are obtained by rotation), and 
define: 

$$H : [0, 2n-1] \to A^{3n}, \qquad h \mapsto \gamma^h(u)\gamma^h(uv).$$

Assume $H(h_1) = H(h_2)$ for some distinct $h_1, h_2 \in [0, 2n-1]$. Let $h = 
\min(h_1 - h_2 \bmod 2n, h_2 - h_1 \bmod 2n) \in [1, n]$. Then, by the definition of $H$, 
$\gamma^{h_1}(u)\gamma^{h_1}(uv) = \gamma^{h_2}(u)\gamma^{h_2}(uv)$. Hence, $\gamma^{h_1}(u) = \gamma^{h_2}(u)$ and $\gamma^{h_1}(uv) = \gamma^{h_2}(uv)$, 
giving $u = \gamma^h(u)$ and $uv = \gamma^h(uv)$. Finally, 

$$u_0 = \gamma^h(u)_{n-h} = u_{n-h} = (uv)_{n-h} = \gamma^h(uv)_{n-h} = (uv)_n = v_0.$$

This is a contradiction, hence $H$ is injective. 



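Let $q \in [0, 2n-1]$ and $p \in [0, n-1]$. Then, 

$$\pi_p((H((q+j) \bmod 2n))_{j \in \mathbb{N}}) = (\gamma^{(q+j) \bmod 2n}(u)_p)_{j \in \mathbb{N}} = (u_{(p+q+j) \bmod n})_{j \in \mathbb{N}} = \sigma^{p+q}(u^\omega) \in B^\omega.$$

Similarly, taking the projection $p$ in $[n, 3n-1]$ we find 

$$\pi_p((H((q+j) \bmod 2n))_{j \in \mathbb{N}}) = (\gamma^{(q+j) \bmod 2n}(uv)_{p-n})_{j \in \mathbb{N}} = ((uv)_{(p-n+q+j) \bmod 2n})_{j \in \mathbb{N}} = \sigma^{p-n+q}((uv)^\omega) \in B^\omega.$$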

□ 
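
The clock of Lemma 4 is simple to compute. The following is a minimal sketch, assuming the reading $H(h) = \gamma^h(u)\gamma^h(uv)$ with $u, v \in B$ and $u_0 \ne v_0$; the words `u` and `v` below are hypothetical sample values, and `rotate` and `clock` are illustrative names.

```python
def rotate(w, h):
    """Rotation gamma^h(w): move the first h letters (modulo |w|) to the end."""
    h %= len(w)
    return w[h:] + w[:h]


def clock(u, v, h):
    """Clock tick H(h) = gamma^h(u) gamma^h(uv), a word of length 3n."""
    return rotate(u, h) + rotate(u + v, h)


u, v = "01", "11"          # two words of length n = 2 with different first letters
n = len(u)
ticks = [clock(u, v, h) for h in range(2 * n)]
```

The `2n` ticks are pairwise distinct, so reading a tick tells which of the `2n` time steps the simulation is in.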

Lemma 5. A full shift $(B^{\mathbb{N}}, \sigma)$, where $B \subset A^n$ and $n \in \mathbb{N}^*$, is a factor of a 
SFT $\Psi$ on alphabet $A^{4n}$ such that $\pi(\Psi) \subset O_\sigma(B^\omega)$, seen as a subshift on $A$. 

Proof. For < q < n, we define the set of time-q encodings as 

Psi q = { ((wo),, K-i^-eN e (A") N | Vp G [0,n - 1] , Wp G a«» d "(B u ) } 

and the decoding function as: 

(A • q 

q (( w o)j ■ ■ ■ {w„-l)j)j 6Nh modn)[0,n-l])j€N 

$\phi_q$ is surjective. If we apply $\sigma$, we pass from a time-$q$ encoding to a time-$(q+1)$ 
encoding: $\sigma(\Psi_q) = \Psi_{q+1 \bmod n}$. In order to know which step we are in (which 
of the first $n$ columns to look at), we use a "clock" represented by the last $3n$ 
columns as described in Lemma 4. We use the set: 

$$\Psi = \bigcup_{0 \le q < 2n} \{ (w_j, H((q+j) \bmod 2n))_{j \in \mathbb{N}} \in (A^{4n})^{\mathbb{N}} \mid (w_j)_{j \in \mathbb{N}} \in \Psi_{q \bmod n} \},$$

which is a disjoint union from Lemma 4, and has the advantage of being a 
subshift, since $\sigma(\Psi_{q \bmod n}) = \Psi_{q+1 \bmod n}$ and $\sigma((H((q+j) \bmod 2n))_{j \in \mathbb{N}}) = (H((q + 
1 + j) \bmod 2n))_{j \in \mathbb{N}}$. The decoding 

' (wj,H(q + j mod 2n)) jeN mod n)((wj)jeN) 

is a factorization deriving from the n-block map ((iOo)j • ■ ■ ( w n-i)j,H(q + j mod 
2n))o<j<« !-»■ modn ))[ , n _i], since 4>(W) D <M*b) = # N - By definition of 

\P q and Lemma 4, the projections are in B u . Last point, we can see that <P" is an 
n-SFT, since: 

{(w )j ...(w 4n -i)j)jeN 

3q G [0 2n - 1] { ^ W °^ j ' ' ' ^ Wn -^^ N G ^ mod " 

J ' \ (K)j • • • (w4n-i)j)jeN = (H(q + j mod 2ri)) jeN 



Vp G [0,n - 1] , w p G cr^ p mod n (B w ) 

Vj G N, (to„)j . . . [W4n-i)j = H(q + j mod 2n) 



where q = H 1 ((w„)o • • • (u>4n-i)o) 

' \ (w n )i . . . (iU4n-i)i =H(q + l mod 2n) 
where g = H~ 1 ((w n ) . ■ ■ {w 4n -i)o) 

□ 

Proof (of Lemma 2). Let $\Sigma$ be a factor subshift of a SFT $\Gamma$ on alphabet $B$ 
such that $B^\omega \subset \Sigma$. By Lemma 5, the full shift $B^{\mathbb{N}}$, where $B \subset A^n$, is a factor 
of a SFT $\Psi$ on alphabet $A^{4n}$ such that $\pi(\Psi) \subset O_\sigma(B^\omega)$. Let $\phi : \Psi \to B^{\mathbb{N}}$ be 
the corresponding factorization. $\Gamma$ is a factor of the subshift $\Gamma' = \phi^{-1}(\Gamma)$. Of 
course, $\pi(\Gamma') \subset \pi(\Psi) \subset O_\sigma(B^\omega)$. Moreover, it is a SFT too, since $w \in \Gamma' \Leftrightarrow w \in 
\Psi$ and $\phi(w) \in \Gamma$. To sum up, $\Sigma$ is a factor of the SFT $\Gamma'$ such that $\pi(\Gamma') \subset \Sigma$. 
Hence, by Proposition 3, $\Sigma$ is T1. □ 



B Proofs of Section 5 

Remark 1. From the definition of $w$, we immediately notice that border words 
have at least one letter that is not in $\phi(A)$: $[\Upsilon] \cap [\phi(A)^{|w|}]_{|w|} = [\Upsilon] \cap [\phi(A)^{|w|}]_{2|w|} = 
\emptyset$. 

Here is a formalization of the property that border words cannot overlap each 
other too much: 

Definition 8 (Freezingness). A language $W \subset A^h$ is $k$-freezing, for some 
integers $k, h \in \mathbb{N}$, if cylinders $[W]$ and $[W]_i$ do not intersect for any $i \in [1, k]$. 

Remark 2. $W$ is $k$-freezing if and only if $[W]_i \cap [W]_j = \emptyset$ if $|j - i| \in [1, k]$ (we 
will not have overlapping border words sharing more than $k$ letters). 

Lemma 6. The set $\Upsilon$ of borders is $(k + 3|w|)$-freezing. 

Proof. 

- If \w\ < i < k + S\w\, Remark 1 gives us [T]n[T], C [(t>(A) k + 3 ^] 3H n [T]; C 
[0(i4)M] i+H n [T]j = 0. 

— Suppose there are an integer i G [0, \w\ — 1], words u, v G C 7 (w) and a con- 
figuration x G [</-(,4)l" , lwu</)( J 4) fc +l"'l]n[< ? !.(^)l tu lw0(^) fc+ l" , l] l . Here we use the 
symmetry of words uu and w. Let p = min { j e [0, 1 — 1] | Xj + |„| £ 0(A) }• 
On the one hand: 

p = min { j G [0, |w - 1] | u } £ <f>{A) } 
= \w\ - max{j G [0, |w| - 1] \u 3 £ 0(A) } (1) 
= 2 \w\ - max { j G [0, 1 - 1] | £ 0(A) } . 

On the other hand: 

p = i + min { j G [0, \w\ - 1] | Wj £ 0(A) } 
= i+ \w\ -max{j G [0, |w| - l]\wj £ 4>{A)} (2) 
= i + 2\w\ - max{j G [0, 1 - 1] ^ £ <j)(A) } . 

Combining Equalities (1) and (2), we get i = 0. 



□ 

The following lemma guarantees that the columns produced by the border evolution are in $\Sigma$. 

Lemma 7. For every border $b \in \Upsilon$, and every column $i \in [0, l-1]$, the infinite 
word $(\Delta_\Upsilon^j(b)_i)_{j \in \mathbb{N}}$ is in subshift $\Sigma$. 

Proof. Consider a border word $b = a^{|w|} \gamma^q(w) \widetilde{\gamma^q(w)} a^{k+3|w|}$. 

- If $i \in [0, |w|-1] \cup [3|w|, l-1]$, then $(\Delta_\Upsilon^j(b)_i)_{j \in \mathbb{N}} = (\phi^j(a))_{j \in \mathbb{N}} \in \Sigma$. 

- If $i \in [|w|, 3|w|-1]$, then by a direct induction one finds that 
$(\Delta_\Upsilon^j(b)_i)_{j \in \mathbb{N}}$ is a shift of $w^\omega$, which belongs to $\Sigma$. 

□ 

Remark 3. $\Theta$ is $(h - 1)$-freezing. 

The following lemma guarantees in particular that we will be able to apply function 
$\Delta$ if there are two macrocells in the neighborhood. 

Lemma 8. $B\Upsilon B\Upsilon \subset \Theta$. 

Proof. First, if $0 < i \le k + 3|w|$, then $[B\Upsilon B\Upsilon \cap A^i B\Upsilon A^{h-i}] \subset [\Upsilon]_k \cap [\Upsilon]_{k+i} = \emptyset$ 
(from Lemma 6). 

Similarly, if $k + 3|w| < i < h$, then $[B\Upsilon B\Upsilon \cap A^i B\Upsilon A^{h-i}] \subset [\Upsilon]_{h+k} \cap [\Upsilon]_{i+k} = 
\emptyset$. □ 

Proof (of Lemma 3). From the definition of $f$ in execution mode and of $\Delta$, we 
notice that: 

$$F([\Theta]) \subset [B\Upsilon] \qquad (3)$$

The appearance of a letter which is not in $\phi(A)$ would mean we are in the left 
part of an execution mode around that cell: if $x \in A^{\mathbb{Z}}$ is such that $F(x)_0 \notin \phi(A)$, 
then there is a $j \in [0, k + 3|w| - 1]$ such that $x \in [\Theta]_{-j}$: 

$$F^{-1}([\phi(A)]^C) \subset \bigcup_{0 \le j < k+3|w|} [\Theta]_{-j} \qquad (4)$$

Now if a configuration $x$ has an image in cylinder $[B\Upsilon]$, then, from Remark 1, 
there is a cell $g \in [k + |w|, k + 2|w| - 1]$ such that $F(x)_g \notin \phi(A)$. Hence, combining 
with Equation (4), with $i = g - j$: 

$$F^{-1}([B\Upsilon]) \subset \bigcup_{-2|w| < i < k+2|w|} [\Theta]_i \qquad (5)$$

Let $x \in F^{-1}([B\Upsilon])$. Then $x \in [\Theta]_i$ for some $i \in [-2|w| + 1, k + 2|w| - 1]$ (from 
Equation (5)). Then $F(x) \in [B\Upsilon] \cap F([\Theta]_i) \subset [\Upsilon]_k \cap [\Upsilon]_{k+i}$ (from Equation (3)). 
$\Upsilon$ being $(k + 3|w|)$-freezing (Lemma 6), we can conclude that $i = 0$. 

$$F^{-1}([B\Upsilon]) \subset [\Theta] \qquad (6)$$

Combining with Equation (3), we get: 

$$F^{-1}([B\Upsilon]) = [\Theta] \qquad (7)$$

Another consequence of Equation (6) is the stability: 

$$F([\Theta]^C) \subset [B\Upsilon]^C \subset [\Theta]^C \qquad (8)$$

Let $x \in [\Theta]$. In particular $x \notin [\Theta]_i$ for $0 < i < h$. Hence, $F(x) \notin [B\Upsilon]_i$ (from 
Equation (6)), and from Equation (3), $F(x) \in [B\Upsilon]$. Finally, $F(x) \in [\Theta]$. 

$$F([\Theta]) \subset [\Theta] \qquad (9)$$

□ 

Lemma 9. Let $y \in B^{\mathbb{Z}}$ and $x \in A^{\mathbb{Z}}$ such that $\forall i \in \mathbb{Z}$, $x_{[ih,ih+k-1]} = y_i \in B$ and 
$x_{[ih+k,(i+1)h-1]} \in \Upsilon$. Then for $0 \le q < k$, $T_F^q(x) = \pi_q(T_G(y))$. 

Proof. We can prove by induction on the generation $j \in \mathbb{N}$, that for any $i \in \mathbb{N}$, 
$F^j(x) \in [G^j(y)_i \Upsilon]_{ih}$. This property holds for $j = 0$. Now suppose it is true 
at generation $j \in \mathbb{N}$, and let us prove it for time $j + 1$. Let $i \in \mathbb{N}$. $F^j(x) \in 
[B\Upsilon B\Upsilon]_{ih} \subset [\Theta]_{ih}$ from the induction hypothesis and Lemma 8. Therefore we are 
in execution mode between cells $ih$ and $(i+1)h$: 

$$F^{j+1}(x)_{[ih,ih+k-1]} = \Delta(F^j(x)_{[ih,(i+1)h-1]}, F^j(x)_{[(i+1)h,(i+2)h-1]})_{[0,k-1]} = g(G^j(y)_i, G^j(y)_{i+1}) = G^{j+1}(y)_i.$$

In particular, $T_F(x) = ((G^j(y)_0)_0)_{j \in \mathbb{N}} = \pi_0(T_G(y))$. □ 

Lemma 10. The trace of every configuration $x \in A^{\mathbb{Z}}$ is in $\Sigma$. 

Proof 

— First, if x ^ [©]-» for any i G [0, ft- — 1], then we will always (see Lemma 3) 
apply the default mode in cell 0: Vi G [0, h - 1] ,Vj G N, F 3 (x) £ [©]_;, and 
by a trivial recurrence, Tp(x) = (F J (x)o)jeN = {ft ( x o))jm G 

— If x G for some i G [fc, ft — 1], then from Lemma 9, Vj G N, F 3 (x) G 
[6>]_i G [T] k -i, and by recurrence, F-?(x) = Ar(x[k-i,h-i-i])i-m, so 
Tp(x) = (F 3 (x) ) jeN = (Z\^(x [fe _ i j l _ i _ 1] ) i _ m ) :) eN G £ by Lemma 7. 

— If x G [<9] g /j_i for some i G [0, A; — 1] and every g G N, then from Lemma 9, 

T F (z) gt (G). 

— Otherwise, there is some i G [0, k — 1] and some q G N* such that x G 
[9 q 9 c ]- U and thanks to Lemma 3, Vj G N,F 3 (x) G [6>«<9 c ]_ t . Let y G B N 
a configuration such that for < p < q, xtph-i^h+k-i-i] = J/p an d f° r P — 9i 
VpUg-i- Then we can show by induction on j G Z that for < p < q, 
F 3 (x)[ph-i, P h+k-i-i\ = G k (y)i, from the definition of /. Hence, T F (x) = 
Ki(T G (y))- 



□ 



C Proofs of examples 



Proof (of Example 4). By contradiction, assume that such a CA $F$ exists. Let $f$ 
be the corresponding local rule, $m$ its anchor and $d$ its diameter. Being surjective 
(we can immediately check that for any configuration $x \in A^{\mathbb{Z}}$, $F^2(x) = x$), 
it is balanced, i.e. $\forall a \in A$, $|f^{-1}(a)| = |A|^{d-1}$. In particular, $|f^{-1}(1)| = 
|A^m 0 A^{d-m-1}|$. Moreover, from the definition of $\Sigma$, $f^{-1}(1) \subset A^m 0 A^{d-m-1}$. 
Equality of cardinals gives $A^m 0 A^{d-m-1} = f^{-1}(1)$. Hence $0^\omega \notin \Sigma$, which is 
a contradiction. □ 

Proof (of Example 5). $10^*$ and $0^*1$ are included in the language of $(0^*1 + 1^*)0^\omega$, 
but not $10^*1$; hence $\Sigma$ is not a SFT. $\Sigma$ contains two distinct transitive subshifts, 
namely $0^\omega$ and $1^\omega$, both of which are finite; hence it is not T2. If $\phi$ is defined 
by $\forall a \in \{0, 1\}$, $\phi(a) = 0$, then $0^\omega, 1^\omega \in \Sigma$ and $w = 1 \in \{0, 1\}^* \setminus 0^*$. Hence $\Sigma$ is 
T3. Finally, build a 2-SFT $\Gamma = ((0,1)^*(1,1) + (1,1)^*)(0,0)^\omega \subset (\{0,1\}^2)^{\mathbb{N}}$, and 
remark that $\pi_0(\Gamma) = \Sigma$; therefore $\Sigma$ is T1. □ 

In order to prove Example 7, we first need to prove the following lemma. 
Lemma 11. Let $A$ and $F$ be as in Example 7. Then 

$$\mathcal{L}(\tau(F)) \cap lb^*rb^*lb^*r = \bigcup_{p,q \in \mathbb{N}} (lb^{2p}rb^{2q}lb^{2p}r).$$

Proof. Remark that the wall $w$ is invariant and blocking. Now we can prove 
that the set of configurations whose trace has a prefix in $lb^*rb^*lb^*r$ is 

$$T_F^{-1}(lb^*rb^*lb^*rA^\omega) = \bigcup_{p,q \in \mathbb{N}} [wb^plb^qw]_{-p-1}.$$

Intuitively, if there were no wall on the left, no particle would come from the 
left, since right particles cannot cross left particles. Similarly, there must be a 
wall on the right. 

Fix $p, q \in \mathbb{N}$. For any $j \in \mathbb{N}$, let $i = j - p - 1 \bmod 2(p + q + 1)$ and 
$S_j = F^j([wb^plb^qw]_{-p-1})$. By induction on $j$ one can easily prove that $S_j \subset 
[wb^irb^{p+q-i}w]_{-p-1}$ if $i \le p + q$ and $S_j \subset [wb^{2(p+q)+1-i}lb^{i-p-q-1}w]_{-p-1}$ otherwise. 
Hence, the trace of a configuration $x \in [wb^plb^qw]_{-p-1}$ is $T_F(x) = (lb^{2p}rb^{2q})^\omega$. 

□ 

Proof (of Example 7). By Lemma 11 we have that 

$$L = \mathcal{L}(\tau(F)) \cap lb^*rb^*lb^*r = \bigcup_{p,q \in \mathbb{N}} (lb^{2p}rb^{2q}lb^{2p}r).$$

Since $lb^*rb^*lb^*r$ is regular and $L$ is not, we conclude that $\mathcal{L}(\tau(F))$ is not regular. 

□ 



